Thermodynamics of DNA loops with long-range correlated structural disorder

LRC, smaller the loop size. We use the mean first passage

The dynamics of folding and unfolding of DNA within living cells is of fundamental importance in a host of biological processes ranging from DNA replication to gene regulation [1]. As the basic unit of eukaryotic chromatin organization, the structure and dynamics of nucleosomes have attracted increasing experimental and theoretical interest [2]. High-resolution X-ray analyses [3] have provided deep insight into the wrapping of 145 bp of DNA in almost two turns around a histone octamer to form a nucleosome core. Recent experiments have shown that nucleosomes are highly dynamical structures that can be moved along DNA by chromatin remodeling complexes [4] but that can also move autonomously on short DNA segments [5]. Different models have been proposed to account for nucleosome mobility [6], including the DNA reptation model that involves intranucleosomal loop diffusion [7] and the nucleosome repositioning model via an extranucleosomal loop [8]; both models provide an attractive picture of how a transcribing RNA polymerase can get around a nucleosome without dissociating it completely. Since the discovery of naturally curved DNA [9], several works have investigated the possibility that the DNA sequence may facilitate nucleosome packaging [10] in the same manner as it can strongly promote the formation of very small loops [11]. Recently, a comparative statistical analysis of eukaryotic sequences and their corresponding DNA bending profiles [12] has revealed that long-range correlations (LRC) in the 10 to 200 bp range are the signature of the nucleosomal structure and that over larger distances (> 200 bp) they are likely to play a role in the condensation of the nucleosomal string into the 30 nm chromatin fiber. To what extent sequence-dependent LRC structural disorder helps to regulate the structure and dynamics of chromatin is of fundamental importance with regard to the structural information that may have been encoded into DNA sequences during evolution. A possible key to this understanding is that the LRC structural disorder induced by the sequence may favor the formation of small (a few hundred bp) DNA loops and, in turn, the propensity of eukaryotic DNA to interact with histones to form nucleosomes.

FIG. 1: Left: "Spontaneous" trajectory for two 2D semi-flexible chains with uncorrelated (H = 0.5, black) and LRC (H = 0.8, grey) structural disorder. Right: "winding" constraint (top) and cyclization constraint (bottom).

Our aim here is to investigate the influence of LRC structural disorder on the thermodynamical properties of semi-flexible chains like DNA when constrained locally to form a loop of size $l$ much smaller than the chain length $L$. Because of the approximate planarity of nucleosomal DNA loops, we assume the chains to be confined in a plane and to be free of any twisting deformation. Within the linear elasticity approximation, the local elastic energy variation of a 2D semi-flexible chain is:

$$\delta E(s) = \frac{A}{2}\left(\dot\theta(s) - \dot\theta_o(s)\right)^2, \tag{1}$$

where $A$ is the bending stiffness, $\dot\theta(s)$ the local curvature and $\dot\theta_o(s)$ the local "spontaneous" curvature of the chain.
To model the intrinsic quenched ($T = 0$) disorder, we consider $\dot\theta_o(s)$ as the realization of a Gaussian fractional noise of zero mean and variance $\sigma_o^2$, such that the corresponding random walk $\Delta\theta_o(s,l) = \int_s^{s+l}\dot\theta_o(u)\,du$ exhibits normal fluctuations characterized by:

$$\overline{\Delta\theta_o(s,l)} = 0, \qquad \overline{\Delta\theta_o^2(s,l)} - \overline{\Delta\theta_o(s,l)}^{\,2} = \sigma_o^2\, l^{2H}, \tag{2}$$

where $H$ is called the Hurst exponent [12, 13]: when $H = 1/2$, one recovers the standard uncorrelated Gaussian noise, and for $H > 1/2$ the distribution of the intrinsic curvature along the chain is LRC. As illustrated in Fig. 1, owing to the persistence of the orientation fluctuations, LRC 2D spontaneous trajectories are more looped than the uncorrelated ones.

FIG. 2: Free energy of a single loop of size $l$ in a chain of length $L = 10^6$ under the "winding" constraint and for $A = 200$, $\sigma_o = 0.01$ and $\alpha = 2\pi$. (a) Energy landscape along a chain with uncorrelated (H = 0.5, black) and LRC (H = 0.8, grey) disorder. (b) Free energy r.m.s. fluctuations $\Delta(l)$ vs. $l$; the symbols correspond to numerical estimates for different disorders: H = 0.5 (○), 0.6 (□), 0.7 (*), 0.8 (△) and 0.9 (◇); the solid lines correspond to Eq. (4). (c) Reduced correlation function $C(y, 200)/\Delta(200)$ vs. $y$; the symbols have the same meaning as in (b); the solid curves correspond to Eq. (5).
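The fractional Gaussian noise of Eq. (2) can be explored numerically. The following is a minimal sketch, assuming the curvature is sampled on a unit-step grid (one value per bp) and using an exact but slow Cholesky factorization of the covariance matrix; the function names and the default amplitude `sigma=0.01` (taken from the weak disorder value quoted in the text) are illustrative choices, not the authors' procedure.

```python
import math
import random


def fgn_autocov(lag, hurst, sigma=1.0):
    """Autocovariance of fractional Gaussian noise sampled at unit steps."""
    k = abs(lag)
    two_h = 2.0 * hurst
    return 0.5 * sigma**2 * ((k + 1) ** two_h + abs(k - 1) ** two_h - 2.0 * k**two_h)


def cholesky_lower(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(matrix)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = matrix[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(acc) if i == j else acc / low[j][j]
    return low


def sample_spontaneous_curvature(n, hurst, sigma=0.01, seed=0):
    """Draw n correlated curvature values (exact but O(n^3): keep n small)."""
    rng = random.Random(seed)
    cov = [[fgn_autocov(i - j, hurst, sigma) for j in range(n)] for i in range(n)]
    low = cholesky_lower(cov)
    white = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return [sum(low[i][k] * white[k] for k in range(i + 1)) for i in range(n)]


def window_variance(length, hurst, sigma=1.0):
    """Exact variance of the curvature summed over `length` consecutive sites."""
    return sum(
        fgn_autocov(i - j, hurst, sigma)
        for i in range(length)
        for j in range(length)
    )
```

With this discretization, `window_variance(l, H, sigma)` equals `sigma**2 * l**(2*H)`, which is the scaling stated in Eq. (2); for `H = 0.5` all nonzero-lag covariances vanish and the noise is uncorrelated.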

To account for the spontaneous formation of a loop of size $l$, we consider chains under the following geometrical constraints (Fig. 1): (i) the "winding" constraint, which amounts to keeping fixed the variation of the orientation over a length $l$, $\int_s^{s+l}\dot\theta(u)\,du = \alpha$, and (ii) the "cyclization" constraint, where in addition the two extremities are held fixed together, $\int_s^{s+l}\cos(\theta(u))\,du = \int_s^{s+l}\sin(\theta(u))\,du = 0$. Given a chain defined by its spontaneous curvature distribution, we first compute the 1D energy landscape $E(s,l)$ associated with the formation of one loop of length $l$ at position $s$. Introducing the constraint via Lagrange multipliers, the equilibrium configuration is obtained by solving the corresponding Euler-Lagrange equations. For the "winding" constraint, the equilibrium equations immediately give the shape of the constrained chain and the corresponding energy cost:

$$E(s,l) = \frac{A}{2l}\left[\Delta\theta_o^2(s,l) - 2\alpha\,\Delta\theta_o(s,l) + \alpha^2\right]. \tag{3}$$
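For the "winding" constraint, minimizing the elastic energy of Eq. (1) at fixed total rotation spreads the missing rotation uniformly over the loop, so that the cost in Eq. (3) is $A(\Delta\theta_o - \alpha)^2/2l$. A minimal sketch of the resulting landscape computation is given below, assuming a unit-step discretization of the chain; the function name and the default values $A = 200$, $\alpha = 2\pi$ (those quoted in the figure captions) are illustrative.

```python
import math


def winding_energy_landscape(spont_curvature, loop_size, stiffness=200.0,
                             alpha=2.0 * math.pi):
    """Energy E(s, l) of Eq. (3) for every admissible loop position s."""
    # Prefix sums give the spontaneous rotation over any window in O(1).
    prefix = [0.0]
    for value in spont_curvature:
        prefix.append(prefix[-1] + value)
    landscape = []
    for s in range(len(spont_curvature) - loop_size + 1):
        delta_theta_o = prefix[s + loop_size] - prefix[s]
        landscape.append(stiffness * (delta_theta_o - alpha) ** 2 / (2.0 * loop_size))
    return landscape
```

A straight chain (zero spontaneous curvature) gives the constant cost $A\alpha^2/2l$, while a chain whose spontaneous curvature already winds by $\alpha$ over the loop costs nothing.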

In Fig. 2(a) are shown the energy landscapes for $l = 200$ of an uncorrelated and an LRC chain; the fluctuations of the latter are of much larger amplitude than those of the former. In the weak disorder (WD) limit ($\sigma_o^2 \ll 1$), the statistics of the energy landscape are Gaussian; using Eq. (2), one gets for the mean $\overline{E}(l) = A[\alpha^2/l + \sigma_o^2\, l^{2H-1}]/2$ and for the variance $\overline{(E(l) - \overline{E}(l))^2} = \alpha^2 A^2 \sigma_o^2\, l^{2H-2}$. For the "cyclization" constraint there is no such general analytic derivation of the equilibrium configuration and one has to resort to numerical computations. As in [14], we have used an iterative scheme to perform numerical computations for several values of $\alpha$, $H$, $\sigma_o$ and $l$. In the WD limit, the equilibrium energy fluctuations obtained numerically with the "cyclization" constraint display Gaussian statistics with the same mean and variance as previously derived with the "winding" constraint.

At finite temperature, one has to consider the effect of thermal fluctuations, which requires computing the free energy cost of the loop formation $\beta f(s,l) = \beta E(s,l) - \Delta S(s,l)$, where $\beta = 1/k_BT$. Under the harmonic approximation, the entropy cost $\Delta S(l) = b - c\ln l$ can be computed analytically (resp. numerically) for the "winding" (resp. "cyclization") constraint, with $c_w = 1/2$ (resp. $c_c = 7/2$). We finally get the following statistical properties of the free energy landscape in the WD limit (with $A$ expressed in units of $k_BT$):

$$\beta\overline{f}(l) = \frac{A}{2}\left[\frac{\alpha^2}{l} + \sigma_o^2\, l^{2H-1}\right] + c\ln l - b, \tag{4}$$

$$\beta^2\,\overline{(f(l) - \overline{f}(l))^2} = A^2\alpha^2\sigma_o^2\, l^{2H-2} \equiv \Delta(l).$$

But the thermodynamical properties of the system are likely to depend on the correlations of the free energy landscape. From Eq. (3), one gets:

$$C(s'-s, l) = \beta^2\,\overline{\delta f(s',l)\,\delta f(s,l)} = \Delta(l)\, C_H\!\left(\frac{s'-s}{l}\right), \tag{5}$$

where $\delta f(s,l) = f(s,l) - \overline{f}(l)$ and $C_H(y) = (|y+1|^{2H} + |y-1|^{2H} - 2|y|^{2H})/2$ is the correlation function of fractional Brownian motions (fBm) [15]. The results reported in Fig. 2(b) show that the scaling form (4) of the free energy r.m.s. fluctuations is well verified for weak disorder ($\sigma_o = 0.01$) up to loop sizes $l < 10^3$. As shown in Fig. 2(c), the free energy correlation function decreases rather fast over a distance of the order of $l$, and then much more slowly at larger distances (the larger $H$, the slower the decrease), in good agreement with the asymptotic behavior $C_H(y) \sim H(2H-1)\,y^{2H-2}$ for $y \to \infty$ (Eq. (5)). While the free energy fluctuations are short-range correlated for $H = 1/2$, they display LRC for $H > 1/2$.
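The two regimes of the reduced correlation function $C_H(y)$ can be checked directly from its definition. The sketch below evaluates $C_H(y)$ and its large-$y$ power-law tail; the function names are illustrative.

```python
def fbm_window_correlation(y, hurst):
    """Reduced correlation C_H(y) of fBm increments taken over unit windows."""
    two_h = 2.0 * hurst
    return 0.5 * (abs(y + 1.0) ** two_h + abs(y - 1.0) ** two_h - 2.0 * abs(y) ** two_h)


def asymptotic_tail(y, hurst):
    """Large-y behaviour H (2H - 1) y^(2H - 2) quoted in the text."""
    return hurst * (2.0 * hurst - 1.0) * y ** (2.0 * hurst - 2.0)
```

For $H = 1/2$ the function vanishes identically beyond one window length ($y \ge 1$), which is the short-range case, whereas for $H > 1/2$ it approaches the slowly decaying tail returned by `asymptotic_tail`.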

The thermodynamics of a single "loop" of size $l$ embedded in a chain of length $L$ is described by the partition function $Z(l,L) = \int_0^L \exp[-\beta f(s,l)]\,ds$, which accounts for all the possible locations of the loop along the chain. The equilibrium properties are determined by the free energy of the system (relative to the unconstrained state of the chain): $\beta\mathcal{F}(l,L) = -\ln(Z(l,L))$. The thermodynamics associated with rugged energy landscapes has been widely studied during the past decades. The equilibrium and nonequilibrium properties depend upon the statistics of the energy fluctuations. When no correlations are present, one recovers the well-known Random Energy Model (REM), which can be solved exactly [16]. This model presents a freezing phase transition separating a self-averaging "high temperature" (HT) phase, where the "constraint" can explore all the possible configurations (positions), and a "low temperature" (LT) phase dominated by the few lowest energy minima, where the "constraint" is likely to be localized [17]. But we have seen in Eq. (5) that the loop free energy fluctuations are LRC, which may question the pertinence of the REM. In the HT/WD limit, $\beta\,\delta f(s,l) \ll 1$, $\forall s$, one gets for finite $L$:

$$\beta\overline{\mathcal{F}}(l) \simeq \beta\overline{f}(l) - \ln L - \frac{1}{2}\Delta(l) + \frac{1}{2}C(l,L), \tag{6}$$

$$\beta^2\,\overline{(\mathcal{F}(l) - \overline{\mathcal{F}}(l))^2} = C(l,L) = \frac{1}{L^2}\iint C(s-s',l)\,ds\,ds'.$$
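The single-loop free energy and its weak-fluctuation expansion can be compared numerically. The sketch below is only illustrative: it assumes the partition function integral is replaced by a sum over unit-step positions, and it uses the second-order cumulant expansion for one given landscape (sample mean minus $\ln L$ minus half the sample variance), whose average over disorder gives the structure of Eq. (6). The function names are hypothetical.

```python
import math


def loop_free_energy(beta_f_landscape):
    """Return beta*F = -ln Z with Z = sum_s exp(-beta f(s)), via log-sum-exp."""
    lowest = min(beta_f_landscape)
    shifted_sum = sum(math.exp(-(value - lowest)) for value in beta_f_landscape)
    return lowest - math.log(shifted_sum)


def weak_fluctuation_estimate(beta_f_landscape):
    """Second-order cumulant estimate: mean - ln L - variance / 2."""
    n = len(beta_f_landscape)
    mean = sum(beta_f_landscape) / n
    variance = sum((value - mean) ** 2 for value in beta_f_landscape) / n
    return mean - math.log(n) - 0.5 * variance
```

Subtracting the minimum before exponentiating keeps the sum well conditioned when the landscape values are large; for a flat landscape both functions reduce to $\beta f - \ln L$.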

The correlations control the sample-to-sample fluctuations. An explicit computation gives, for both the "winding" and "cyclization" constraints, $C(l,L) \sim \Delta(l)(L/l)^{2H-2} \propto L^{2H-2}$. The correlations vanish, independently of $l$, in the thermodynamic limit $L \to \infty$, leading to the asymptotic validity of the REM [18]. Combining Eqs. (4) and (6), one gets in the HT/WD phase:

$$\beta\overline{\mathcal{F}}(l) = \frac{A\alpha^2}{2l} + \frac{A\sigma_o^2}{2}\, l^{2H-1} - \frac{A^2\alpha^2\sigma_o^2}{2}\, l^{2H-2} + c\ln l - b - \ln L. \tag{7}$$

In Figs. 3(a,b) is reported the evolution of the free energy of the single loop system vs. the size of the "cyclization" constraint for $L = 15000$, $A = 200$, $\alpha = 2\pi$ and a disorder amplitude $\sigma_o = 0.01$ (0.05), comparable to that obtained when using experimentally established structural tables [14]. The symbols correspond to the exact numerical estimate of the free energy for five values of $H$ corresponding to increasingly strong LRC, while the continuous curves correspond to the quenched free energy $\overline{\mathcal{F}}_H(l,L)$ averaged over 100 single loop chains. From both numerical and analytical results, one can extract the following main messages. (i) In the absence of disorder ($\sigma_o = 0$), the "pure" system has a free energy that presents a minimum for a finite length $l^* = \alpha^2 A/2c$ (Eq. (7)). This optimal length separates the enthalpic domain at small scale $l$, characterized by a power-law decrease of the free energy, from the entropic domain at large scale, characterized by a logarithmic increase. (ii) When one adds some intrinsic uncorrelated disorder ($H = 1/2$), the $l$-dependence of the free energy reduces (up to a constant) to that of a homogeneous "pure" case with a renormalized value of the bending stiffness $A_{\mathrm{eff}} = A(1 - A\sigma_o^2)$ [14, 19]. Thus there is no qualitative difference between an uncorrelated system and an ideal one, but introducing disorder decreases the free energy (Fig. 3(a,b)) and favors the formation of loops of smaller size $l^* = \alpha^2 A_{\mathrm{eff}}/2c$. (iii) When considering LRC disorder, the system no longer behaves as a homogeneous one; more importantly, in the small-scale domain, both the free energy and the optimal length $l^*(H)$ decrease (Fig. 3(d)) when one increases $H$. As shown in Fig. 3(c), for a fixed loop size $l = A = 200$, the quenched average free energy provides a good description of the free energy of a typical single loop chain for both $\sigma_o = 0.01$ and 0.05. Note that these results are well accounted for by the HT/WD approximation (Eq. (7)) only for $\sigma_o = 0.01$ and $H < 0.7$. Similar results are obtained in Fig. 3(d) for the optimal loop length $l^*(H)$, which is shown to decrease down to values of a few hundred when increasing $H$ from 0.5 to 0.9. For $\sigma_o = 0.01$, the solution of the HT/WD perturbative equation:

$$(2H-1)A\sigma_o^2\, l^{2H} + 2cl - (2H-2)\alpha^2A^2\sigma_o^2\, l^{2H-1} = A\alpha^2, \tag{8}$$

provides a rather good description of the $H$-dependence of the loop size $l^*(H)$ of a typical single loop chain. The perturbative expression of the free energy (Eq. (7)) breaks down when the energy fluctuations become too large: this is the freezing transition towards the low temperature/strong disorder phase, where the replica approach needs to be used to get the correct quenched free energy [17]. The computation of the localized states is not the purpose of this letter since, as shown in Fig. 3, for parameter values compatible with DNA characteristic properties, namely $A = 200$, $\alpha = 2\pi$ and $\sigma_o \sim 0.01$, the HT/WD approximation is likely to apply.

FIG. 3: Free energy of the single loop system $\mathcal{F}_H(l,L)$ vs. $l$ under the "cyclization" constraint and for $L = 15000$, $A = 200$, $\alpha = 2\pi$ and $\sigma_o = 0.01$ (a) and 0.05 (b). The symbols correspond to a single chain's free energy $\mathcal{F}_H(l,L)$ obtained from the (exact) numerical computation of $f(s,l)$ for H = 0.5 (○, ●), 0.6 (□, ■), 0.7 (☆, ★), 0.8 (△, ▲) and 0.9 (◇, ♦); the (×) correspond to the "pure" case without disorder. The continuous curves stand for the corresponding quenched free energies $\overline{\mathcal{F}}_H(l,L)$ averaged over 100 "typical" single chain free energies computed using Eq. (3) for the enthalpic part (i.e. the "winding" energy) and $c_c = 7/2$ for the entropy cost; the dotted curve corresponds to the exact analytical expression for the "pure" case. (c) Free energy difference $\mathcal{F}_H(l) - \mathcal{F}_{1/2}(l)$ vs. $H$ between LRC and uncorrelated chains, for loop size $l = 200$; the dashed curves correspond to the perturbative approximation. (d) Optimal loop length $l^*(H)$ vs. $H$; the dashed curves correspond to the perturbative expression (8); the horizontal line indicates the optimal loop length $l^* = 1128$ for the "pure" system. In (c) and (d) the symbols and the continuous curves have the same meaning as in (a) and (b).
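Equation (8) has no closed-form root for general $H$, but its left-hand side increases monotonically with $l$ for $1/2 \le H < 1$, so a simple bisection suffices. The sketch below is one possible way to solve it, assuming the "cyclization" values $A = 200$, $\alpha = 2\pi$, $c = 7/2$ quoted in the text as defaults; the function names and the search bracket are illustrative choices.

```python
import math


def eq8_residual(l, hurst, sigma_o, stiffness=200.0, alpha=2.0 * math.pi, c=3.5):
    """Left-hand side minus right-hand side of Eq. (8)."""
    two_h = 2.0 * hurst
    return (
        (two_h - 1.0) * stiffness * sigma_o**2 * l**two_h
        + 2.0 * c * l
        - (two_h - 2.0) * alpha**2 * stiffness**2 * sigma_o**2 * l ** (two_h - 1.0)
        - stiffness * alpha**2
    )


def optimal_loop_length(hurst, sigma_o, lo=1.0, hi=1.0e5, tol=1e-9):
    """Bisection root of Eq. (8) inside the bracket [lo, hi]."""
    f_lo = eq8_residual(lo, hurst, sigma_o)
    if f_lo * eq8_residual(hi, hurst, sigma_o) > 0.0:
        raise ValueError("bracket does not enclose a root of Eq. (8)")
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if eq8_residual(mid, hurst, sigma_o) * f_lo > 0.0:
            lo = mid  # same sign as the lower end: the root lies above mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Setting `sigma_o = 0` recovers the "pure" optimum $\alpha^2A/2c \approx 1128$, and $H = 1/2$ reproduces the renormalized-stiffness result $l^* = \alpha^2 A(1 - A\sigma_o^2)/2c$.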

FIG. 4: MFPT $\tau(N)$ (measured in number of steps) for $A = l = 200$, $\alpha = 2\pi$, $L = 15000$ and $\sigma_o = 0.01$. (a) Pdf of $\tau(N)$ calculated for 100000 uncorrelated (H = 0.5, black) and LRC (H = 0.8, grey) chains for N = 100 (thin) and 200 (thick). (b) Most probable MFPT vs. $N$ for H = 0.5 (○) and 0.8 (△); mean MFPT vs. $N$ for H = 0.6 (■) and 0.8 (▲); the dashed curves correspond to the analytical quenched average (Eq. (10)); the dotted curves correspond to the small displacement approximation (Eq. (11)); the continuous line corresponds to the "pure" diffusion case $\tau(N) = N^2$.

As emphasized in Ref. [20], a convenient formalism to investigate diffusion processes in the random 1D potential $E(s,l)$ of a single loop of fixed length is that of the mean first passage time (MFPT). The MFPT (expressed in number of elementary steps) at position $N$ (starting from $s = 0$) is given by:

$$\tau(N,l) \simeq 2\int_0^N ds \int_s^N ds'\, \exp[2\beta(E(s) - E(s'))]. \tag{9}$$
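The double integral of Eq. (9) can be evaluated in a single pass over the landscape. The sketch below is a minimal illustration, assuming a unit-step discretization in which the diagonal term $s' = s$ carries half weight (a trapezoid-like choice made here so that a flat landscape gives exactly $N^2$); the function name is hypothetical.

```python
import math


def mean_first_passage_time(energies, beta=1.0):
    """Discretized Eq. (9): 2 * sum over s <= s' of exp[2 beta (E(s) - E(s'))]."""
    n = len(energies)
    # The integrand only depends on energy differences, so shift by the mean
    # to keep the exponentials in a safe numerical range.
    reference = sum(energies) / n
    up = [math.exp(2.0 * beta * (e - reference)) for e in energies]
    down = [math.exp(-2.0 * beta * (e - reference)) for e in energies]
    total = 0.0
    tail = 0.0  # running sum of down[s'] over s' > s
    for s in range(n - 1, -1, -1):
        total += up[s] * (tail + 0.5 * down[s])
        tail += down[s]
    return 2.0 * total
```

For a flat landscape of length $N$ the function returns $N^2$, the "pure" diffusion case; the running suffix sum keeps the cost linear in $N$ instead of quadratic.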

The average over all possible realizations of the disordered energy landscape leads to:

$$\overline{\tau(N,l)} \simeq 2\int_0^N ds \int_s^N ds'\, e^{4[\Delta(l) - C(s'-s,\,l)]}. \tag{10}$$

When looking at displacements smaller than or of the order of the loop size, $N \lesssim l$, the typical energy barrier increases like $\Delta E(N) \sim N^H$: the energy landscape has an fBm structure. For $(s'-s)/l \ll 1$, Eq. (10) reduces to:

$$\overline{\tau(N,l)} \simeq N^2\, e^{\frac{4}{(2H+1)(H+1)}\Delta(l)(N/l)^{2H}}. \tag{11}$$
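The small-displacement reduction can be sketched as follows. This assumes that the disorder-averaged MFPT has the Gaussian-average form $2\int_0^N ds\int_s^N ds'\, e^{4[\Delta(l) - C(s'-s,l)]}$, with $C = \Delta(l)\,C_H$ as in Eq. (5), and that the exponent is small enough to be expanded to first order and then re-exponentiated. The numerical prefactor $4/((2H+1)(H+1))$ is what this first-order argument yields and should be read as an estimate.

```latex
% Small-lag expansion of the fBm correlation function, y = (s'-s)/l << 1:
\Delta(l) - C(s'-s,l) = \Delta(l)\,[1 - C_H(y)] \simeq \Delta(l)\,|y|^{2H}

% Insert into the double integral and expand the exponential to first order,
% using 2 \int_0^N ds \int_s^N ds' g(s'-s) = 2 \int_0^N (N-u) g(u) du:
2\int_0^N ds \int_s^N ds'\,\Big[1 + 4\Delta(l)\Big(\tfrac{s'-s}{l}\Big)^{2H}\Big]
  = N^2 + \frac{8\Delta(l)}{l^{2H}} \int_0^N (N-u)\,u^{2H}\,du
  = N^2\Big[1 + \frac{4}{(2H+1)(H+1)}\,\Delta(l)\Big(\frac{N}{l}\Big)^{2H}\Big]

% Re-exponentiating the bracket gives the small-displacement form:
\overline{\tau(N,l)} \simeq N^2 \exp\Big[\frac{4}{(2H+1)(H+1)}\,\Delta(l)\Big(\frac{N}{l}\Big)^{2H}\Big]
```

For $H = 1/2$ the exponent becomes linear in $N$, which is the exponential creep mentioned in the text.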

We thus get a stretched exponential creep that depends on $H$. For $H = 1/2$, one recovers the exponential creep of the Random Force Model (RFM) with logarithmically slow ("Sinai") diffusion [21, 22]. When strengthening the LRC by increasing $H > 1/2$, one further increases $\overline{\tau(N,l)}$, suggesting some slowing down of the loop dynamics. In Fig. 4(b), this modified Sinai diffusion [21] accounts quite well for the short-distance dynamics of single loop chain realizations. But as shown in Fig. 4(a), when computing the probability density function (pdf) of the MFPT over 100000 realizations for distances $N < 2l$, the way the average MFPT depends on $H$ is very much affected by the evolution of the pdf tail and does not reflect the dependence of the most probable MFPT (as defined by the pdf maximum), which in contrast decreases when increasing $H$. This shows that for a typical event, the motion of a single loop in an LRC chain over distances of the order of its size is definitely superdiffusive (see inset of Fig. 4(b)): the larger $H$, the faster the dynamics.

To summarize, we have shown that the competing effects of entropy and sequence-dependent structural disorder favor the autonomous formation of DNA loops. When taking into account the existence of LRC as observed in eukaryotic genomic sequences [12], we have found, in the WD limit, that strengthening LRC allows the formation of smaller loops that superdiffuse: the stronger the LRC, the faster the typical local loop dynamics. These results strongly suggest that these LRC may have been encoded into genomic sequences during evolution to predispose eukaryotic DNA to interact with histones to form nucleosomes. The size of the selected loops (a few hundred bp) is typical of the length of DNA that is wrapped around histones; we refer the reader to a recent work of Bussiek et al. [23] where, in high salt concentration conditions, the nucleosomes are observed to be preferentially located at the crossing of DNA loops of characteristic length ~ 200 bp (50 nm). The local rapid diffusion of the loop induced by the LRC structural disorder provides a very attractive interpretation of the nucleosome repositioning dynamics. LRC are likely to help the nucleosomes to rearrange themselves in a very efficient way, e.g. after the passage of the transcription and replication polymerases. Since in in vivo chromatin the nucleosomal string presents a high occupation density, with an average distance between nucleosomes of the order of 50 bp, this raises the issue of the effect of the interaction between nucleosomes on their large-scale mobility. The generalization of the present work to multiple 2D loops in a long LRC DNA chain is currently in progress.